LCORE-1422: Inline RAG (BYOK) e2e tests#1370
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds inline BYOK RAG (FAISS) support and tests: introduces FAISS env var to Compose files, new inline-FAISS Lightspeed YAMLs for library/server modes, E2E feature and step definitions, test hooks to swap configs, unit test for env-var resolution, and env-var-aware duplicate detection in vector-store construction. Changes
Sequence Diagram(s)sequenceDiagram
participant Test as Test Scenario
participant Env as Test Environment
participant LCS as Lightspeed Stack
participant FAISS as FAISS Vector DB
participant LLM as OpenAI LLM
Test->>Env: Start `@InlineRAG` feature
Env->>Env: Backup default lightspeed config
Env->>Env: Load inline-rag YAML (library/server)
Env->>LCS: Restart container with inline RAG config
Test->>LCS: GET /v1/rags (auth)
LCS->>LCS: List registered RAGs (includes e2e-test-docs)
LCS-->>Test: 200 + rags: ["e2e-test-docs"]
Test->>LCS: POST /v1/query (with system_prompt)
LCS->>FAISS: Search vectors for query
FAISS-->>LCS: Return matching chunks
LCS->>LLM: Invoke LLM with retrieved context
LLM-->>LCS: Generated response
LCS-->>Test: 200 + response (includes rag_chunks)
Test->>Env: Finish feature
Env->>LCS: Restore default config and restart
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/e2e/configuration/library-mode/lightspeed-stack.yaml (1)
29-29: Use an isolated E2E DB path to avoid cross-run state leakage.Line 29 points to a shared home-path DB, which can retain prior test state and weaken deterministic E2E behavior.
♻️ Proposed change
- db_path: ~/.llama/storage/rag/kv_store.db + db_path: /tmp/lightspeed/rag/${env.FAISS_VECTOR_STORE_ID}/kv_store.db🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/e2e/configuration/library-mode/lightspeed-stack.yaml` at line 29, The config uses a shared home-path under the db_path key which can leak state between runs; update the db_path entry in lightspeed-stack.yaml to point to an isolated per-run test location (for example a temp directory under the repo like tests/e2e/tmp/kv_store.db or use an env var such as TEST_DB_PATH that the e2e runner populates), ensuring the test harness creates/cleans that directory before/after each run so each E2E run uses a fresh KV store.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/e2e/configuration/library-mode/lightspeed-stack.yaml`:
- Line 28: The vector_db_id field currently uses the env placeholder
${env.FAISS_VECTOR_STORE_ID} with no fallback, which can leave the literal
placeholder in config and break FAISS lookups; update the lightspeed-stack
configuration to provide a safe fallback (e.g., use the env fallback syntax like
${env.FAISS_VECTOR_STORE_ID:=<default>} or programmatically validate/abort when
FAISS_VECTOR_STORE_ID is missing) and/or add a clear documentation/validation
step for library-mode entrypoints to require FAISS_VECTOR_STORE_ID; target the
vector_db_id setting and the FAISS_VECTOR_STORE_ID environment usage in the
stack config and any launch validation logic (entrypoint/launcher bootstrap) to
implement this change.
---
Nitpick comments:
In `@tests/e2e/configuration/library-mode/lightspeed-stack.yaml`:
- Line 29: The config uses a shared home-path under the db_path key which can
leak state between runs; update the db_path entry in lightspeed-stack.yaml to
point to an isolated per-run test location (for example a temp directory under
the repo like tests/e2e/tmp/kv_store.db or use an env var such as TEST_DB_PATH
that the e2e runner populates), ensuring the test harness creates/cleans that
directory before/after each run so each E2E run uses a fresh KV store.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 08affa37-389a-4c86-8438-a9753c02d591
📒 Files selected for processing (6)
tests/e2e/configs/run-ci.yamltests/e2e/configuration/library-mode/lightspeed-stack-auth-noop-token.yamltests/e2e/configuration/library-mode/lightspeed-stack.yamltests/e2e/configuration/server-mode/lightspeed-stack-auth-noop-token.yamltests/e2e/configuration/server-mode/lightspeed-stack.yamltests/e2e/features/faiss.feature
✅ Files skipped from review due to trivial changes (2)
- tests/e2e/features/faiss.feature
- tests/e2e/configuration/library-mode/lightspeed-stack-auth-noop-token.yaml
There was a problem hiding this comment.
♻️ Duplicate comments (1)
tests/e2e/configuration/library-mode/lightspeed-stack.yaml (1)
28-28:⚠️ Potential issue | 🟠 Major
FAISS_VECTOR_STORE_IDshould not rely on implicit launcher wiring.Line 28 still has no fallback/guard (
${env.FAISS_VECTOR_STORE_ID}), so misconfigured launchers can end up with unresolved placeholder behavior at runtime. Please either add a deterministic default via:=...or enforce explicit startup validation for this env var.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/e2e/configuration/library-mode/lightspeed-stack.yaml` at line 28, The YAML uses an unresolved environment placeholder for vector_db_id (${env.FAISS_VECTOR_STORE_ID}) which can lead to runtime unresolved values; either provide a deterministic default in the template (use the := form so vector_db_id: ${env.FAISS_VECTOR_STORE_ID:=your-default-id}) or add explicit startup validation that checks process.env.FAISS_VECTOR_STORE_ID and fails fast with a clear error before using vector_db_id (update the launcher/startup validation logic that reads FAISS_VECTOR_STORE_ID to validate presence and exit with a helpful message).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@tests/e2e/configuration/library-mode/lightspeed-stack.yaml`:
- Line 28: The YAML uses an unresolved environment placeholder for vector_db_id
(${env.FAISS_VECTOR_STORE_ID}) which can lead to runtime unresolved values;
either provide a deterministic default in the template (use the := form so
vector_db_id: ${env.FAISS_VECTOR_STORE_ID:=your-default-id}) or add explicit
startup validation that checks process.env.FAISS_VECTOR_STORE_ID and fails fast
with a clear error before using vector_db_id (update the launcher/startup
validation logic that reads FAISS_VECTOR_STORE_ID to validate presence and exit
with a helpful message).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 93417b3f-b5cb-4a4c-8d6e-0ee974b3bd7d
📒 Files selected for processing (7)
tests/e2e/configs/run-ci.yamltests/e2e/configuration/library-mode/lightspeed-stack-auth-noop-token.yamltests/e2e/configuration/library-mode/lightspeed-stack-inline-rag.yamltests/e2e/configuration/library-mode/lightspeed-stack.yamltests/e2e/configuration/server-mode/lightspeed-stack-auth-noop-token.yamltests/e2e/configuration/server-mode/lightspeed-stack-inline-rag.yamltests/e2e/configuration/server-mode/lightspeed-stack.yaml
✅ Files skipped from review due to trivial changes (2)
- tests/e2e/configuration/library-mode/lightspeed-stack-inline-rag.yaml
- tests/e2e/configuration/server-mode/lightspeed-stack-inline-rag.yaml
🚧 Files skipped from review as they are similar to previous changes (4)
- tests/e2e/configuration/server-mode/lightspeed-stack.yaml
- tests/e2e/configuration/server-mode/lightspeed-stack-auth-noop-token.yaml
- tests/e2e/configs/run-ci.yaml
- tests/e2e/configuration/library-mode/lightspeed-stack-auth-noop-token.yaml
tisnik
left a comment
There was a problem hiding this comment.
It would be better not to introduce "magic" tags that change feature behaviours. Please use the standard approach - Background of feature or Given steps. TY in advance
e671522 to
94f1c37
Compare
are-ces
left a comment
There was a problem hiding this comment.
Removed magic attribute
a8efb6f to
db6d857
Compare
db6d857 to
cda64ae
Compare
src/llama_stack_configuration.py
Outdated
| return output | ||
|
|
||
|
|
||
| def _resolve_env_var(value: str) -> str: |
There was a problem hiding this comment.
does not resolve_env implement it?
There was a problem hiding this comment.
I see, there is a method in the llama-stack core library
There was a problem hiding this comment.
addressed in new commit
|
|
||
| mode_dir = "library-mode" if context.is_library_mode else "server-mode" | ||
| if is_prow_environment(): | ||
| config_path = f"tests/e2e-prow/rhoai/configs/{config_name}" |
There was a problem hiding this comment.
@radofuchs does it make sense to setup those paths in Background or Given?
There was a problem hiding this comment.
we can set it up partially, but we still need to differentiate between library and server mode in the same test
| config_path = f"tests/e2e/configuration/{mode_dir}/{config_name}" | ||
| create_config_backup("lightspeed-stack.yaml") | ||
| switch_config(config_path) | ||
| restart_container("lightspeed-stack") |
There was a problem hiding this comment.
for the same reason that is here for creating this step, we should also create a separate step for the lightspeed-stack restart as we are hiding this functionality from the user who would want to reproduce the behavior
There was a problem hiding this comment.
addressed in new commit
| @@ -0,0 +1,41 @@ | |||
| name: Lightspeed Core Service (LCS) | |||
There was a problem hiding this comment.
create a copy of these new configs also in e2e-prow
There was a problem hiding this comment.
addressed in new commit
| """ | ||
| {"query": "What does Paul Graham say about great work?"} | ||
| """ | ||
| Then The status code of the response is 200 |
There was a problem hiding this comment.
why did you skip the verification of the response content in this case?
There was a problem hiding this comment.
Thought separate tests should test different things, but I can update the test to get more consistency in the results
| Scenario: Responses API with inline RAG returns relevant content | ||
| Given The system is in default state | ||
| And I set the Authorization header to Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6Ikpva | ||
| When I use "responses" to ask question with authorization header |
There was a problem hiding this comment.
add here also test where stream is set to true to cover all existing cases
There was a problem hiding this comment.
addressed in new commit, TY
| And The service uses the lightspeed-stack-inline-rag.yaml configuration | ||
|
|
||
| Scenario: Check if inline RAG source is registered | ||
| Given The system is in default state |
There was a problem hiding this comment.
first two lines of every test in this feature are identical. You can move them to background to improve readability
There was a problem hiding this comment.
addressed in new commit
| restart_container("lightspeed-stack") | ||
| # Library mode needs extra time to load embedding models after restart | ||
| wait_for_container_health("lightspeed-stack", max_attempts=12) | ||
|
|
There was a problem hiding this comment.
you never return to the original config file here, set some flag here that will trigger cleanup at the end of the scenario/feature
tisnik
left a comment
There was a problem hiding this comment.
LGTM from my perspective, but @radofuchs concerns should be solved too
Add end-to-end tests for inline RAG with BYOK configuration. Includes test configs for both server and library modes, feature file with 3 scenarios, and environment hooks for config switching. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The enrichment skipped duplicates by comparing vector_store_id values,
but existing entries using ${env.VAR} patterns were compared as literal
strings against resolved values from Pydantic config, causing false
negatives and duplicate registration crashes in library mode.
Also switch inline RAG test configs to noop auth so the Docker health
check works correctly during container restarts.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ation from run-ci Remove vector_stores pre-registration from run-ci.yaml so BYOK RAG is managed entirely through lightspeed-stack configs. Add byok_rag with rag.tool mode to default and auth-noop-token configs for both server and library modes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…nfig from run-ci
Use ${env.KV_RAG_PATH:=~/.llama/storage/rag/kv_store.db} for BYOK
db_path so tilde is properly resolved inside the container. Remove
the FAISS vector_io provider and kv_rag storage backend from run-ci
since the BYOK enrichment now creates them.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ds modules The llama-stack enrichment entrypoint runs llama_stack_configuration.py which imports constants and log modules from src/. Setting PYTHONPATH ensures these resolve correctly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move inline_rag.feature right after faiss.feature in the test order. MCP tests call clear_llama_stack_storage() which rm -rf ~/.llama inside the container, destroying the bind-mounted kv_store.db needed by RAG tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the implicit @InlineRAG feature tag (handled in environment.py
hooks) with a parameterized Background step "The service uses the
{config_name} configuration" that switches config and restarts the
container. The step is reusable for any config file and idempotent
within a feature run via backup file existence check.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The "Inline RAG query includes referenced documents" scenario was only checking rag_chunks and response format without verifying referenced documents. Replace with a dedicated referenced_documents assertion. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add streaming Responses API inline RAG e2e scenario (stream: true) - Copy inline RAG config to e2e-prow/rhoai/configs for prow compatibility - Extract restart into explicit "The service is restarted" Gherkin step - Replace custom _resolve_env_var with llama_stack replace_env_vars Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
d14567a to
fc8b5ec
Compare
* LCORE-1422: Inline RAG (BYOK) e2e tests
- Add 6 e2e scenarios: query, streaming query, responses API, streaming responses API, referenced documents, RAG source registration
- Add inline RAG configs for server, library, and prow modes
- Add reusable "The service uses the {config} configuration" Background step
- Fix BYOK RAG enrichment duplicate detection with env var references
- Add after_feature safety net to restore config backups
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Description
Add end-to-end tests for inline RAG (BYOK) to verify that deterministic context injection works correctly through the lightspeed-stack configuration.
Scenarios:
/v1/ragswith user-facingrag_idrag_chunksfrom inline context injection (no tool calls)Files added/changed:
lightspeed-stack-inline-rag.yaml)inline_rag.feature) with 3 scenariosrag_chunksassertion@InlineRAGconfig switchingFAISS_VECTOR_STORE_IDenv var added to lightspeed-stack container in docker-composeType of change
Tools used to create PR
Related Tickets & Documents
Checklist before requesting a review
Testing
uv run behave tests/e2e/features/inline_rag.feature— 3/3 scenarios passuv run behave tests/e2e/features/faiss.feature— 4/4 existing scenarios pass (no regression)uv run make formatanduv run make verify— all checks passSummary by CodeRabbit
Tests
Configuration
Chores